Risk Prediction Score for Chronic Kidney Disease in Healthy Adults and Adults With Type 2 Diabetes: Systematic Review

Introduction: Chronic kidney disease (CKD) is an important public health problem. In 2017, the global prevalence was estimated at 9.1%. Appropriate tools to predict the risk of developing CKD are necessary to prevent its progression. Type 2 diabetes is a leading cause of CKD; screening the population living with the disease is a cost-effective solution to prevent CKD. The aim of our study was to identify the existing prediction scores and their diagnostic accuracy for detecting CKD in apparently healthy populations and populations with type 2 diabetes.
Methods: We conducted an electronic search in databases, including Medline/PubMed, Embase, Health Evidence, and others. For the inclusion criteria we considered studies with a risk predictive score in healthy populations and populations with type 2 diabetes. We extracted information about the models, variables, and diagnostic accuracy, such as area under the receiver operating characteristic curve (AUC), C statistic, or sensitivity and specificity.
Results: We screened 2,359 records and included 13 studies for healthy population, 7 studies for patients with type 2 diabetes, and 1 for both populations. We identified 12 models for patients with type 2 diabetes; the range of C statistic was from 0.56 to 0.81, and the range of AUC was from 0.71 to 0.83. For healthy populations, we identified 36 models with the range of C statistics from 0.65 to 0.91, and the range of AUC from 0.63 to 0.91.
Conclusion: This review identified models with good discriminatory performance and methodologic quality, but they need more validation in populations other than those studied. This review did not identify risk models with variables comparable between them to enable conducting a meta-analysis.


Introduction
Chronic kidney disease (CKD) has been defined as abnormalities of kidney structure or function present for more than 3 months (1). CKD is a public health problem (2-4). According to data from the Global Burden of Disease (GBD) study (4), the global prevalence of CKD in 2017 was estimated at 9.1%. Of total mortality, 4.6% of deaths were attributable to CKD and to cardiovascular disease (CVD) that was itself attributable to impaired kidney function. Lead was among the risk factors for CKD quantified in GBD. Approximately 31% of CKD disability-adjusted life years were attributable to diabetes (4).
After automatic reporting of the estimated glomerular filtration rate (eGFR) began, referrals to nephrology specialists by primary care services increased. However, the proportion of appropriate referrals did not change, indicating a need to develop appropriate screenings for CKD (5). Persons living with hypertension, diabetes, or cardiovascular diseases should be screened for CKD; identifying and treating CKD would reduce the burden of kidney disease (6). CKD can be detected early through inexpensive interventions (4).
Echouffo-Tcheugui and Kengne presented a systematic review with 30 models predicting the occurrence of CKD and concluded that some models had acceptable discriminatory performance (7). CKD screening in groups at high risk is likely to be cost-effective. Predictive models that incorporate clinical information systems would facilitate improved treatment allocations and health care management (6,8).
The aim of our study was to identify the existing prediction risk scores and their diagnostic accuracy for detecting CKD in apparently healthy adults and adults living with type 2 diabetes.

Methods
We followed the methodology proposed by the Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy (DTA). The protocol was published at PROSPERO (https://www.crd.york.ac.uk/prospero/), registration number CRD42021252888.
A search strategy was designed for the following databases: Cochrane Library, Medline/PubMed, Embase, Latin American and Caribbean Health Sciences Literature (LILACS), Cumulative Index to Nursing and Allied Health Literature (CINAHL), PsycInfo, Trip Database, Epistemonikos, and Health Evidence. We used the medical subject heading (MeSH) term "renal insufficiency, chronic" and the terms "risk models" and "predictive models," which were validated in a pilot search. The detailed search strategy is in the Appendix. Most of the databases used had artificial intelligence features that helped combine these terms with similar terms. All records were screened by title and abstract, then assessed by full text. We finally selected those that met all the selection criteria. The screening process included the reference lists of the studies included in this review, other similar reviews, and a manual search of other studies identified by the authors in previous searches.

Study selection
We included cohort and cross-sectional studies without language restrictions. The search was intentionally limited to the period from May 2011 to November 2021 to update the information provided in previous reviews.
Studies that included healthy adults and adults living with type 2 diabetes were incorporated. The exclusion criteria were studies where the database used was from hospitalized patients with an initial diagnosis of CKD, and patients living with type 1 diabetes.
Included articles had to report models as a risk assessment tool that predicted CKD in healthy adults or adults living with type 2 diabetes. In this review we excluded predictive models of mortality, progression of CKD, and machine learning technology.
Major outcomes that we sought were area under the receiver operating characteristic curve (AUC) or C statistic to predict the presence or occurrence of CKD in healthy adults and adults with type 2 diabetes. Secondary outcomes that we looked for were sensitivity and specificity to predict the presence or occurrence of CKD in healthy adults and adults with type 2 diabetes. The studies that we included compared their models against the reference standards eGFR, albuminuria, or proteinuria.
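As an illustration of the outcome measures named above, the following minimal sketch computes sensitivity, specificity, and AUC for a risk score against a binary reference standard. The scores, CKD labels, cutoff, and function names are hypothetical and are not taken from any included study; for a binary outcome, the C statistic is equivalent to the AUC computed here.

```python
def sensitivity_specificity(scores, has_ckd, cutoff):
    """Return (sensitivity, specificity), calling score >= cutoff a positive test."""
    pairs = list(zip(scores, has_ckd))
    tp = sum(1 for s, y in pairs if y and s >= cutoff)       # true positives
    fn = sum(1 for s, y in pairs if y and s < cutoff)        # false negatives
    tn = sum(1 for s, y in pairs if not y and s < cutoff)    # true negatives
    fp = sum(1 for s, y in pairs if not y and s >= cutoff)   # false positives
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores, has_ckd):
    """AUC as the probability that a random person with CKD has a higher
    score than a random person without CKD (ties count as one half)."""
    cases = [s for s, y in zip(scores, has_ckd) if y]
    controls = [s for s, y in zip(scores, has_ckd) if not y]
    wins = sum(
        1.0 if c > n else 0.5 if c == n else 0.0
        for c in cases
        for n in controls
    )
    return wins / (len(cases) * len(controls))


# Hypothetical risk scores and reference-standard results (True = CKD present).
scores = [2, 3, 5, 6, 7, 8, 4, 9]
has_ckd = [False, False, False, True, False, True, True, True]

sens, spec = sensitivity_specificity(scores, has_ckd, cutoff=6)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}, AUC={auc(scores, has_ckd):.4f}")
```

Sensitivity and specificity depend on the chosen cutoff, whereas the AUC summarizes discrimination across all cutoffs, which is one reason the two kinds of measures are reported separately.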
Two authors of this review (A.G.-R. and V.C.) independently screened titles and abstracts to identify relevant articles. In the first step of this process, reviews were removed; the full texts of the remaining articles were then systematically examined for inclusion or exclusion. In the event of disagreement, a third author (E.D.-G.) decided whether to include the article. The selection stages are shown in the flowchart based on Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines (Figure 1).
Figure 1. Study selection process for the analysis of chronic kidney disease (CKD) in healthy adults and adults living with type 2 diabetes. Abbreviations: CINAHL, Cumulative Index to Nursing and Allied Health Literature; CVD, cardiovascular disease; eGFR, estimated glomerular filtration rate; ESRD, end-stage renal disease; LILACS, Latin American and Caribbean Health Sciences Literature.
The information extracted was the design of the studies, type of population studied, type of prediction model and its variables, type of statistical analysis, type of reference standard, and outcomes. We also separated studies by training, development, and external validation models. The data were obtained in duplicate by V.C. and A.G.-R. and corroborated by E.D.-G. The extraction and analysis were performed by separating healthy populations and type 2 diabetes populations; the results are presented as separate groups.
Two reviewers (V.C. and A.G.-R.) independently assessed the risk of bias with Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2), guided by the Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy. We used the software tool RevMan 5.4 (Cochrane).
The QUADAS-2 tool evaluated 4 principal domains: 1) patient selection, 2) index test, 3) reference standard, and 4) flow and timing. The applicability concerns were evaluated in 3 domains: 1) patient selection, 2) index test, and 3) reference standard. Each potential bias and concern was graded as high, low, or unclear risk. Risk of bias was evaluated by A.G.-R. and V.C.

Statistical analysis and data synthesis
To synthesize the information, we divided it by population (healthy population and type 2 diabetes population) and looked for homogeneity in the baseline characteristics of the participants of the studies and the possible risk factors. Nevertheless, because of the heterogeneity of metrics and variables used to assess the predictive ability of CKD risk models, we conducted a qualitative synthesis of the full evidence instead of a meta-analysis.

Search results
An electronic search was conducted in May 2021 and updated in January 2022. We identified 2,359 records by searching databases, registers, and other sources; 9 of the studies were identified by manual searching, and 1 was from gray literature. From the 2,359 we removed 29 duplicate records. In the screening stage, 1,691 records that did not meet the inclusion criteria were eliminated. Two studies were not retrieved because they were not fully published, leaving 31 studies retrieved for full-text analysis. Of these, 17 were eliminated because they 1) were prediction models for advanced CKD, 2) were about prevalence of CKD, 3) had insufficient data for analysis, or 4) had a sample that mixed type 2 diabetes and type 1 diabetes populations (9). From other methods (reference scanning, manual searching, and gray literature), 7 studies were retrieved. Finally, for the qualitative analysis 21 studies were included: 13 studies (10-22) with prediction models to assess the presence or occurrence of CKD in healthy adults, 7 studies (23-28, and one unpublished paper [A. Raña-Custodio, M. Lajous, E. Denova-Gutiérrez, M. Chávez-Cárdenas, R. Lopez-Ridaura, and G. Danaei, personal communication, 2023]) with prediction models to assess the presence or occurrence of CKD in people with type 2 diabetes, and 1 study including a model for both populations (29). In those studies, we identified 48 different models.
The main characteristics of the risk predictive models for CKD developed in each study are described in Table 1. Fourteen studies were developed by using prospective cohort data, 4 were developed by using cross-sectional data, and 3 were developed by using retrospective cohort data. Of the studies included, 13 reported the C statistic, 10 reported AUC, and 7 reported sensitivity and specificity.
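The final stages of the selection flow can be checked with simple arithmetic. The sketch below uses only the counts stated in the paragraph above (full-text reports, full-text exclusions, studies from other methods, and the breakdown by population); the variable names are illustrative, and earlier screening stages are omitted because not every intermediate count is given in the text.

```python
# Counts taken from the search results paragraph.
full_text_retrieved = 31     # studies retrieved for full-text analysis
excluded_at_full_text = 17   # eliminated after full-text review
from_other_methods = 7       # reference scanning, manual searching, gray literature

included = full_text_retrieved - excluded_at_full_text + from_other_methods

# Included studies by population, as reported.
by_population = {"healthy adults": 13, "type 2 diabetes": 7, "both populations": 1}

print(f"included studies: {included}")
print(f"sum by population: {sum(by_population.values())}")
```

Both routes give the 21 studies included in the qualitative analysis.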

Healthy population risk scores
We synthesized information from 14 studies (10-22,29) that developed equations to detect the risk of CKD in healthy populations; 36 different predictive models were identified. Three models used 5 risk factors (11,18,21); the number of factors included in the models ranged from 3 (18) to 146 variables (12). Some of the most common risk factors included for the predictive equation were age, sex, type 2 diabetes (glycated hemoglobin A1c [HbA1c], fasting plasma glucose, or history of diabetes), kidney function (eGFR, proteinuria, or albuminuria), cardiovascular disease (systolic blood pressure, diastolic blood pressure, or history of hypertension), and obesity (waist circumference or body mass index). For the reference standard, all studies used eGFR, with the cutoff established by Kidney Disease Improving Global Outcomes (KDIGO) guidelines (1). Additionally, 4 studies (15,18,20,29) conducted an external validation.
PREVENTING CHRONIC DISEASE VOLUME 20, E30 PUBLIC HEALTH RESEARCH, PRACTICE, AND POLICY APRIL 2023 The opinions expressed by authors contributing to this journal do not necessarily reflect the opinions of the U.S. Department of Health and Human Services, the Public Health Service, the Centers for Disease Control and Prevention, or the authors' affiliated institutions.
The sensitivity range was 50.3 to 89.4; the highest sensitivity value was from the Kwon et al model (15) that incorporated age, sex, anemia, hypertension, type 2 diabetes, cardiovascular disease, and proteinuria. The specificity range was 0.51 to 97.3; the highest specificity was from the derivation model (12) that included 146 variables.

Type 2 diabetes risk scores
We synthesized information from 8 studies (23-29 and 1 unpublished paper [A. Raña-Custodio, M. Lajous, E. Denova-Gutiérrez, M. Chávez-Cárdenas, R. Lopez-Ridaura, and G. Danaei, personal communication, 2023]) that developed risk equations for people with type 2 diabetes, analyzing 11 different models. Most of the predictive models used at least 5 risk factors; the number of factors in the models ranged from 5 (23,24) to 16 variables (Raña-Custodio et al, personal communication, 2023). Some of the most common risk factors included for the predictive equation were age, sex, eGFR, and HbA1c.
In 6 studies (23-25,28,29, and unpublished study), outcome accuracy was calculated with C statistics; in 2 studies, with AUC (26,27); and 2 reported sensitivity and specificity (Table 2). The range of C statistics was 0.5 to 0.8; the highest AUC was 0.83. The highest C statistic was from the external validation of the diabetic model (29); this model included age, sex, race, ethnicity, eGFR, history of cardiovascular disease, ever smoker, hypertension, body mass index, albuminuria, diabetes medications (insulin vs only oral medications vs none), and HbA1c. The range of sensitivity was 64.5 to 75.6; the highest sensitivity was also from the external validation of the diabetic model. The specificity range was 46.5 to 72.3; the highest specificity was from the test data set of the developed model (26).

Methodologic quality of included studies
Of all studies included in the synthesis, in the patient selection domain 81% had low risk of bias and 76% had low applicability concern (Figure 2). Dunkler et al had unclear risk of bias because the sample included people receiving pharmacologic therapy (24). Hippisley-Cox and Coupland had high applicability concern because the study population had moderate CKD, identified from records of kidney transplant and kidney dialysis (13). In the index test domain, 86% had low risk of bias and 100% had low applicability concern. In the reference standard domain, 71% had low risk of bias; Saranburut et al used an outcome from a modification of the KDIGO definition (18). Also in the reference standard domain, 90% had low applicability concern. For the flow and timing domain, 95% had low risk of bias.
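The percentages above are consistent with whole-study counts out of the 21 included studies. The sketch below makes that relationship explicit; the per-domain counts are assumptions back-calculated from the rounded percentages, not values reported in the text, and the dictionary keys are illustrative labels.

```python
N_STUDIES = 21  # studies included in the qualitative synthesis

# Assumed number of studies rated "low" per QUADAS-2 item,
# inferred from the reported percentages (not given in the text).
low_counts = {
    "patient selection, risk of bias": 17,
    "patient selection, applicability": 16,
    "index test, risk of bias": 18,
    "index test, applicability": 21,
    "reference standard, risk of bias": 15,
    "reference standard, applicability": 19,
    "flow and timing, risk of bias": 20,
}


def percent_low(count, total=N_STUDIES):
    """Percentage of studies rated low, rounded to the nearest integer."""
    return round(100 * count / total)


percentages = {item: percent_low(n) for item, n in low_counts.items()}
for item, pct in percentages.items():
    print(f"{item}: {pct}%")
```

Under these assumed counts, the rounded values reproduce the reported 81%, 76%, 86%, 100%, 71%, 90%, and 95%.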

Discussion
The aim of this systematic review was to identify the existing prediction scores and their diagnostic accuracy for detecting CKD. Thus, we identified 48 different predictive models in 21 total studies of healthy people and people with type 2 diabetes. For healthy populations, we analyzed 14 studies presenting 36 predictive scores for CKD and a wide range (4 to 146) of variables considered by each author. Populations with type 2 diabetes were summarized in 8 studies presenting 15 different models with a range of 4 to 16 variables.
Evaluating the accuracy of these models is essential to finding the best, yet still feasible, way to predict the risk of CKD. For the healthy population, we identified 11 models predicting CKD with an AUC above 0.8, which is considered good discriminatory performance (30).
Our findings differ from those of another review (31). Using a specific quality assessment tool (QUADAS-2), we found predictive models with good accuracy and quality. For example, Al-Shamsi et al (10) presented a stepwise model for a healthy population with an AUC of 0.9 (95% CI, 0.8-0.9) using variables that are simple and reliable in primary care (eGFR, diabetes, cholesterol, and HbA1c), with low risk of bias and low applicability concern. Also, Yu et al (22) presented a sex-specific model with AUC over 0.9 for both sexes, with low risk of bias and low applicability concern.
For the population with type 2 diabetes, Low et al (26), with 0.8 AUC, 75.6 sensitivity, and 72.3 specificity, had the highest accuracy, considered as good discriminatory performance. This risk score includes the variables log albumin-to-creatinine ratio, systolic blood pressure, HbA1c, eGFR, low-density lipoprotein cholesterol, and age, with low risk of bias and low applicability concern. The presence of type 2 diabetes is one of the main risk factors for developing CKD; identifying the population at higher risk is vital for public health. In both populations, the risk models with highest accuracy had the HbA1c and eGFR variables in common. Chadban et al recognized that blood glucose plays a significant role in the development of CKD (32), and this is reflected in the predictive models that include a variable related to blood glucose.
These models show wide heterogeneity in the variables included, similar to findings in other reviews (7,31). Despite this heterogeneity, the predictive equations shared common variables: age, hypertension-related variables (systolic blood pressure or diastolic blood pressure), body mass index, and diabetes-related variables (history of diabetes, HbA1c, glucose). In agreement with other authors (7), we found that using predictive models with variables that are feasible to obtain in primary care could help professionals at this level of health care alert the population at risk. Also, examining these variables gave us a chance to consider prevention therapies that control the progression of these variables, as reflected in the progression of CKD. CKD risk predictive models should be applied mainly in populations with risk factors for CKD susceptibility, initiation, or progression (33).
To our knowledge, this is the first systematic review of risk prediction models for CKD that assessed risk of bias with a validated tool for diagnostic accuracy studies such as QUADAS-2. Conducting a risk of bias analysis is important in a systematic review, because after the accuracy of the studies is identified, the methodologic quality plays an important role for future research and for the populations affected. Also, this review presents results for general populations and for populations with type 2 diabetes that are at higher risk for CKD.

As a limitation, this review did not identify risk models with variables comparable between them to conduct a meta-analysis. Therefore, it is not possible to make recommendations for the use of the models in other populations. We suggest that future work validate in different populations the existing scores and obtain comparable data to make recommendations.
By synthesizing the existing models in this report, we give public health researchers and clinicians a wide view of existing models. From these models they can choose the one that best applies to their population with regard to accuracy and methodologic quality.

Conclusions
We synthesized risk models to detect CKD in healthy adults and adults with type 2 diabetes. Of those, 11 models for healthy populations and 3 for type 2 diabetes patients were identified with good discriminatory performance and methodologic quality. The development of these models, using all those different variables, offers a broad view of their accuracy and of the risk factors. The burden of CKD is increasing in both absolute and relative terms; identifying models that can help to predict the risk of CKD could be the first step to prevent CKD and inform the population. These models are important in primary care settings to help identify people at risk and promptly start prevention or treatment. Some of these models had variables easily obtained at primary care services, which could improve the accuracy of screening the population at risk and referring patients to a specialist as needed. Finally, these tools need to be externally validated to identify their accuracy in other populations and to provide more information to affected populations regarding public policies about the risk of incident CKD.